
Deep neural network compression algorithm based on combined dynamic pruning

ZHANG Mingming, LU Qingning, LI Wenzhong, SONG Hu

Journal of Computer Applications 2021, 41 (6): 1589-1596. DOI: 10.11772/j.issn.1001-9081.2020121914


As a branch of model compression, network pruning reduces computational cost by removing unimportant parameters from a deep neural network. However, permanent pruning causes an irreversible loss of model capacity. To address this issue, a combined dynamic pruning algorithm was proposed that jointly analyzes the characteristics of the convolution kernels and the input image. Some of the convolution kernels were zeroized but still allowed to be updated during training until the network converged, after which the zeroized kernels were permanently removed. At the same time, the input images were sampled to extract their features, and a channel importance prediction network analyzed these features to determine which channels could be skipped during the convolution operation. Experimental results on M-CifarNet and VGG16 show that combined dynamic pruning achieves floating-point operation compression ratios of 2.11 and 1.99 respectively, with accuracy losses of less than 0.8 and 1.2 percentage points respectively compared to the baseline models (M-CifarNet, VGG16). Compared with existing network pruning algorithms, the combined dynamic pruning algorithm effectively reduces the number of Floating-Point Operations (FLOPs) and the parameter scale of the model, and achieves higher accuracy at the same compression ratio.
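The zeroize-then-keep-updating step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: ranking kernels by L1 norm, the fixed `prune_ratio`, and the function name `zeroize_kernels` are all assumptions, since the abstract does not state the selection criterion.

```python
def l1_norm(kernel):
    """Sum of absolute weights of one flattened convolution kernel."""
    return sum(abs(w) for w in kernel)


def zeroize_kernels(kernels, prune_ratio):
    """Zero the lowest-norm kernels in place and return their indices.

    Assumption: importance is measured by L1 norm. The zeroized kernels
    stay in the list, so a later training step can still update them;
    only after convergence would they be removed for good.
    """
    n_prune = int(len(kernels) * prune_ratio)
    order = sorted(range(len(kernels)), key=lambda i: l1_norm(kernels[i]))
    pruned = sorted(order[:n_prune])
    for i in pruned:
        kernels[i] = [0.0] * len(kernels[i])
    return pruned


# Four toy kernels with two weights each (hypothetical values).
kernels = [[0.5, -0.4], [0.01, 0.02], [1.0, 0.3], [-0.05, 0.0]]
pruned = zeroize_kernels(kernels, prune_ratio=0.5)
```

Because the zeroized kernels remain in place, a kernel that regains a large norm after further training would simply not be selected at the next pruning step, which is what distinguishes this from permanent pruning.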


Accurate object tracking algorithm based on distance weighting overlap prediction and ellipse fitting optimization

WANG Ning, SONG Huihui, ZHANG Kaihua

Journal of Computer Applications 2021, 41 (4): 1100-1105. DOI: 10.11772/j.issn.1001-9081.2020060869


To address the problems of Discriminative Correlation Filter (DCF) tracking algorithms, such as model drift, coarse scale estimation and tracking failure when the tracked object undergoes rotation or non-rigid deformation, an accurate object tracking algorithm based on Distance Weighting Overlap Prediction and Ellipse Fitting Optimization (DWOP-EFO) was proposed. Firstly, both the overlap and the center distance between bounding boxes were used as the basis for evaluating dynamic anchor boxes, which narrows the spatial distance between the prediction result and the object region and eases the model drift problem. Secondly, to further improve tracking accuracy, a lightweight object segmentation network was applied to segment the object from the background, and an ellipse fitting algorithm was applied to optimize the segmentation contour and output a stable rotated bounding box, achieving accurate estimation of the object scale. Finally, a scale-confidence optimization strategy was used to gate the output so that only scale results with high confidence were emitted. The proposed algorithm alleviates model drift and improves both the robustness and the accuracy of the tracker. Experiments were conducted on two widely used evaluation datasets, Visual Object Tracking challenge (VOT2018) and Object Tracking Benchmark (OTB100). Experimental results demonstrate that the proposed algorithm improves the Expected Average Overlap (EAO) index by 2.2 percentage points compared with Accurate Tracking by Overlap Maximization (ATOM) and by 1.9 percentage points compared with Learning Discriminative Model Prediction for tracking (DiMP). Meanwhile, on OTB100, the proposed algorithm outperforms ATOM by 1.3 percentage points on the success rate index and performs especially well on the non-rigid deformation attribute. The proposed algorithm runs at over 25 frame/s on average on the evaluation datasets, achieving real-time tracking.
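One way to combine overlap with center distance when scoring a candidate box is sketched below. The abstract does not give the weighting formula, so this sketch swaps in an assumed form: plain IoU minus the squared center distance normalized by the squared diagonal of the smallest enclosing box. The function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def distance_weighted_overlap(a, b):
    """IoU penalized by normalized center distance (assumed formula)."""
    # Squared distance between the two box centers.
    cax, cay = (a[0] + a[2]) / 2.0, (a[1] + a[3]) / 2.0
    cbx, cby = (b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0
    d2 = (cax - cbx) ** 2 + (cay - cby) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    penalty = d2 / c2 if c2 > 0 else 0.0
    return iou(a, b) - penalty


target = (0.0, 0.0, 4.0, 4.0)
shifted = (1.0, 0.0, 5.0, 4.0)  # same size as target, center moved by 1
score_same = distance_weighted_overlap(target, target)
score_shifted = distance_weighted_overlap(target, shifted)
```

With a score of this kind, two candidates with equal overlap are ranked by how close their centers are to the target, which pulls predictions toward the object region.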


Multi-level feature enhancement for real-time visual tracking

FEI Dasheng, SONG Huihui, ZHANG Kaihua

Journal of Computer Applications 2020, 40 (11): 3300-3305. DOI: 10.11772/j.issn.1001-9081.2020040514


To solve the problem that the Fully-Convolutional Siamese visual tracking network (SiamFC) drifts and fails when distractors with similar semantic information appear, a Multi-level Feature Enhanced Siamese network (MFESiam) was designed to improve the robustness of the tracker by enhancing the representation capabilities of the high-level and shallow-level features respectively. Firstly, a lightweight and effective feature fusion strategy was adopted for shallow-level features. A data augmentation technique was used to simulate changes in complex scenes, such as occlusion, similarity interference and fast motion, to enhance the texture characteristics of shallow features. Secondly, for high-level features, a Pixel-aware global Contextual Attention Module (PCAM) was proposed to improve localization ability by capturing long-range dependencies. Finally, extensive experiments were conducted on three challenging tracking benchmarks: OTB2015, GOT-10K and 2018 Visual-Object-Tracking (VOT2018). Experimental results show that the success rate of the proposed algorithm exceeds that of the baseline SiamFC by 6.3 percentage points on OTB2015 and 4.1 percentage points on GOT-10K, and that the algorithm runs at 45 frames per second, achieving real-time tracking. The expected average overlap index of the proposed algorithm surpasses that of the champion of the VOT2018 real-time challenge, the high-performance Siamese with Region Proposal Network (SiamRPN), which verifies the effectiveness of the proposed algorithm.


Mixed-order channel attention network for single image super-resolution reconstruction

YAO Lu, SONG Huihui, ZHANG Kaihua

Journal of Computer Applications 2020, 40 (10): 3048-3053. DOI: 10.11772/j.issn.1001-9081.2020020281


Current channel attention mechanisms used for super-resolution reconstruction have two problems: the attention prediction destroys the direct correspondence between each channel and its weight, and the mechanism considers only first-order or only second-order channel attention without exploiting their complementary advantages. Therefore, a mixed-order channel attention network for image super-resolution reconstruction was proposed. First, by using a local cross-channel interaction strategy, the increase and reduction of the channel dimension used by the first-order and second-order channel attention models were replaced by a fast one-dimensional convolution with kernel size k, which not only makes the channel attention prediction more direct and accurate but also makes the resulting model simpler. In addition, the improved first-order and second-order channel attention models were both adopted to combine the advantages of channel attention of different orders, thus improving the discriminative ability of the network. Experimental results on the benchmark datasets show that, compared with existing super-resolution algorithms, the proposed method recovers the best texture details and high-frequency information in the reconstructed images, and the Perceptual Index (PI) on the Set5 and BSD100 datasets is increased by 0.3 and 0.1 on average respectively. This shows that the network predicts channel attention more accurately and makes comprehensive use of channel attention of different orders, thereby improving performance.
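The local cross-channel interaction idea, where a one-dimensional convolution over neighbouring channels replaces the dimension reduction and expansion layers, can be sketched as follows. This is a simplified first-order illustration only: it assumes each channel is summarized by its mean, that the kernel length k is odd, and that zero padding and a sigmoid gate are used. None of these details are given in the abstract, and the second-order branch is not shown.

```python
import math


def channel_attention(channel_means, kernel):
    """Per-channel weights from a 1-D convolution across channels.

    channel_means: one pooled descriptor per channel (assumed: the mean).
    kernel: k convolution taps, k odd, shared by all channels.
    Each weight depends only on a channel and its k - 1 neighbours, so
    the channel-to-weight correspondence stays direct.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for c in range(len(channel_means)):
        z = sum(kernel[j] * padded[c + j] for j in range(k))
        weights.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid gate
    return weights


# Hypothetical descriptors for five channels and a kernel with k = 3.
weights = channel_attention([0.2, 1.5, -0.3, 0.8, 0.0], [0.5, 1.0, 0.5])
neutral = channel_attention([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
```

The only learnable parameters in this sketch are the k kernel taps, which is why such a layer is much smaller than a pair of fully connected layers over all channels.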


Video object segmentation method based on dual pyramid network

JIANG Sihao, SONG Huihui, ZHANG Kaihua, TANG Runfa

Journal of Computer Applications 2019, 39 (8): 2242-2246. DOI: 10.11772/j.issn.1001-9081.2018122566


To address the difficulty of segmenting a specific object in a complex video scene, a video object segmentation method based on a Dual Pyramid Network (DPN) was proposed. Firstly, a one-way modulating network was used to make the segmentation model adapt to the appearance of a specific object: a modulator was learned from the visual and spatial information of the target object and used to modulate the intermediate layers of the segmentation network so that the network adapts to appearance changes of that object. Secondly, global context information was aggregated in the last layer of the segmentation network by a different-region-based context aggregation method. Finally, a left-to-right architecture with lateral connections was developed for building high-level semantic feature maps at all scales. The proposed video object segmentation network can be trained end-to-end. Extensive experimental results show that the proposed method achieves results competitive with state-of-the-art methods that use online fine-tuning on the DAVIS2016 dataset, and outperforms other methods on the DAVIS2017 dataset.


Real-time visual tracking based on dual attention siamese network

YANG Kang, SONG Huihui, ZHANG Kaihua

Journal of Computer Applications 2019, 39 (6): 1652-1656. DOI: 10.11772/j.issn.1001-9081.2018112419


To solve the problem that the Fully-Convolutional Siamese network (SiamFC) tracking algorithm is prone to model drift and tracking failure when the tracked target undergoes dramatic appearance changes, a new Dual Attention Siamese network (DASiam) was proposed to adapt the network model without online updating. Firstly, a modified Visual Geometry Group (VGG) network, which is more expressive and better suited to the target tracking task, was used as the backbone network. Then, a novel dual attention mechanism was added to the middle layer of the network to dynamically extract features. This mechanism consisted of a channel attention mechanism and a spatial attention mechanism. The channel dimension and the spatial dimension of the feature maps were transformed to obtain the dual attention feature maps. Finally, the feature representation of the model was further improved by fusing the feature maps of the two attention mechanisms. Experiments were conducted on three challenging tracking benchmarks: OTB2013, OTB100 and the 2017 Visual-Object-Tracking challenge (VOT2017) real-time challenge. The experimental results show that, running at 40 frame/s, the proposed algorithm achieves higher success rates than the baseline SiamFC on OTB2013 and OTB100, by margins of 3.5 and 3 percentage points respectively, and surpasses the 2017 champion SiamFC in the VOT2017 real-time challenge, verifying the effectiveness of the proposed algorithm.


Real-time visual tracking algorithm via channel stability weighted complementary learning

FAN Jiaqing, SONG Huihui, ZHANG Kaihua

Journal of Computer Applications 2018, 38 (6): 1751-1754. DOI: 10.11772/j.issn.1001-9081.2017112735


To solve the problem that the Sum of template and pixel-wise learners (Staple) tracking algorithm fails under in-plane rotation and partial occlusion, a simple and effective Channel Stability-weighted Staple (CSStaple) tracking algorithm was proposed. Firstly, a standard correlation filter classifier was employed to compute the response of each channel. Then, the stability weight of each channel was calculated and multiplied by the weight of each layer to obtain the correlation filtering response. Finally, the response of the color complementary learner was integrated to obtain the final response, and the location of its maximum value was taken as the tracking result. The proposed algorithm was compared with several state-of-the-art tracking algorithms, including Channel and Spatial Reliability Discriminative Correlation Filter (CSR-DCF) tracking, Hedged Deep Tracking (HDT), Kernelized Correlation Filter (KCF) tracking and Staple. The experimental results show that the proposed algorithm achieves the best success rate, 2.5 and 0.9 percentage points higher than Staple on OTB50 and OTB100 respectively, which demonstrates its effectiveness under target in-plane rotation and partial occlusion.


Face super-resolution via very deep convolutional neural network

SUN Yitang, SONG Huihui, ZHANG Kaihua, YAN Fei

Journal of Computer Applications 2018, 38 (4): 1141-1145. DOI: 10.11772/j.issn.1001-9081.2017092378


To handle multiple scale factors in face super-resolution, a face super-resolution method based on a very deep convolutional neural network was proposed, and experiments showed that increasing the network depth effectively improves the accuracy of face reconstruction. Firstly, a network consisting of 20 convolution layers was designed to learn an end-to-end mapping between low-resolution and high-resolution images, and many small filters were cascaded to extract more textural information. Secondly, a residual-learning method was introduced to solve the problem of detail loss caused by increasing depth. In addition, low-resolution face images with multiple scale factors were merged into one training set so that the network could perform face super-resolution at multiple scale factors. The results on the CASPEAL test dataset show that the proposed method achieves a 2.7 dB increase in Peak Signal-to-Noise Ratio (PSNR) and a 2% increase in structural similarity compared to the Bicubic-based face reconstruction method. Compared with the SRCNN method, it also achieves greater improvements in accuracy and visual quality. This indicates that deeper network structures can achieve better reconstruction results.
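For readers unfamiliar with the PSNR metric quoted above, the standard definition can be sketched as follows. The sketch assumes 8-bit images flattened to lists of pixel values; the function name and the `max_value` default are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length images.

    PSNR = 10 * log10(max_value^2 / MSE), where MSE is the mean squared
    pixel error. Identical images have zero error and infinite PSNR.
    """
    mse = sum((r - s) ** 2 for r, s in zip(reference, reconstructed))
    mse /= len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 4-pixel images (hypothetical values).
worst_case = psnr([0, 0, 0, 0], [255, 255, 255, 255])
identical = psnr([10, 20, 30, 40], [10, 20, 30, 40])
```

Because the scale is logarithmic, a gain of 2.7 dB corresponds to a mean squared error that is lower by a factor of about 10^0.27, roughly 1.86.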


Night-time vehicle detection based on Gaussian mixture model and AdaBoost

CHEN Yan, YAN Teng, SONG Junfang, SONG Huansheng

Journal of Computer Applications 2018, 38 (1): 260-263. DOI: 10.11772/j.issn.1001-9081.2017071763


To address the relatively low accuracy of night-time vehicle detection, a method was proposed that detects night-time vehicles accurately by constructing a Gaussian Mixture Model (GMM) for the geometric relationship of the headlights and an AdaBoost (Adaptive Boosting) classifier trained on inverse projected vehicle samples. Firstly, the inverse projection plane was set according to the spatial position of the headlights in the traffic scene, and the headlight area was roughly located by image preprocessing. Secondly, the geometric relationship of the headlights was used to construct the GMM on the inverse projected images, and the headlights were initially matched. Finally, the vehicles were detected by using the AdaBoost classifier trained on inverse projected vehicle samples. In comparison experiments with the AdaBoost classifier applied to the original image, the proposed method increased the detection rate by 1.93%, decreased the omission ratio by 17.83%, and decreased the false detection rate by 27.61%. Compared with the D-S (Dempster-Shafer) evidence theory method, the proposed method increased the detection rate by 2.03%, decreased the omission ratio by 7.58%, and decreased the false detection rate by 47.51%. The proposed method effectively improves the relative detection accuracy, reduces the interference of ground reflection and shadow, and satisfies the reliability and accuracy requirements of night-time vehicle detection in traffic scenes.


Unsupervised video segmentation by fusing multiple spatio-temporal feature representations

LI Xuejun, ZHANG Kaihua, SONG Huihui

Journal of Computer Applications 2017, 37 (11): 3134-3138. DOI: 10.11772/j.issn.1001-9081.2017.11.3134


To cope with random movement of the segmented target, rapid background changes, and arbitrary variation and shape deformation of the object appearance, a new unsupervised video segmentation algorithm based on multiple spatio-temporal feature representations was presented. By combining salient features with other features obtained from pixels and superpixels, a coarse-to-fine robust feature representation was designed to represent each frame in a video sequence. Firstly, a set of superpixels was generated to represent the foreground and background in order to improve computational efficiency, and segmentation results were obtained by the graph-cut algorithm. Then, the optical flow method was used to propagate information between adjacent frames, and the appearance of each superpixel was updated with its non-local spatio-temporal features, found by nearest neighbor search with an efficient K-Dimensional tree (K-D tree) algorithm, so as to improve the robustness of segmentation. After that, for the segmentation results generated at the superpixel level, a new pixel-based Gaussian mixture model was constructed to achieve pixel-level refinement. Finally, the saliency feature of the image was combined with the segmentation results generated by graph-cut and the Gaussian mixture model through a voting scheme to obtain more accurate segmentation results. The experimental results show that the proposed algorithm is robust and effective, and is superior to most unsupervised video segmentation algorithms and some semi-supervised video segmentation algorithms.


Simple method to improve the iterative detection convergence of SCCRFQPSK

ZHANG Gaoyuan, WEN Hong, SONG Huanhuan, LI Tengfei

Journal of Computer Applications 2014, 34 (9): 2486-2490. DOI: 10.11772/j.issn.1001-9081.2014.09.2486


Firstly, the Maximum-A-Posteriori-probability (MAP) demodulation of recursive FQPSK-B over an Additive White Gaussian Noise (AWGN) channel was presented. The bit extrinsic Log-Likelihood Ratio (ex-LLR) of FQPSK demodulation, which is required in the iterative detection of Serial Concatenation of Convolutional coded Recursive FQPSK (SCCRFQPSK), was also derived. Secondly, to weaken the positive feedback that occurs during the iterative detection of SCCRFQPSK, the bit ex-LLR of FQPSK demodulation was adjusted by linear weighting. Monte Carlo simulation showed that the optimal weighting factor of the weighted SCCRFQPSK system was 0.7, which gave a Signal-to-Noise Ratio (SNR) gain of 0.3 dB at a Bit Error Rate (BER) of 10^-5 after 4 iterations. The simulation results indicate that the proposed method not only accelerates decoding convergence and improves the performance of the SCCRFQPSK system, but also reduces the delay. To a certain extent, it can cope with the low SNR caused by long distances in deep space communication.
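The linear weighting of the extrinsic LLRs amounts to scaling them before they are passed to the next stage of the iterative detector. A minimal sketch is given below; the function name is hypothetical, and only the factor 0.7 comes from the abstract.

```python
def weight_extrinsic_llr(ex_llr, factor=0.7):
    """Scale extrinsic LLRs by a constant factor.

    A factor below 1 shrinks the magnitude (the confidence) of each LLR
    while keeping its sign (the bit decision), which damps the positive
    feedback between the demodulator and the decoder.
    """
    return [factor * value for value in ex_llr]


# Hypothetical extrinsic LLRs from one demodulation pass.
ex_llr = [2.0, -4.0, 0.5]
damped = weight_extrinsic_llr(ex_llr)
halved = weight_extrinsic_llr(ex_llr, factor=0.5)
```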


Simple efficient bit-flipping decoding algorithm for low density parity check code

ZHANG Gaoyuan, WEN Hong, LI Tengfei, SONG Huanhuan

Journal of Computer Applications 2014, 34 (10): 2796-2799. DOI: 10.11772/j.issn.1001-9081.2014.10.2796


To improve the efficiency of Bit Flipping (BF) decoding, a weighted gradient descent bit-flipping decoding algorithm based on average magnitude was proposed for Low Density Parity Check (LDPC) codes. The average magnitude of the information nodes was first introduced as the reliability of the parity checks and used to weight the bipolar syndrome, from which an effective bit-flipping function was obtained. Simulations were conducted at a Bit Error Rate (BER) of 10^-5 over an Additive White Gaussian Noise (AWGN) channel. Coding gains of 0.08 dB and 0.29 dB were achieved in comparison with the conventional weighted Gradient Descent Bit-Flipping (GDBF) and Reliability Ratio based Weighted Gradient Descent Bit-Flipping (RRWGDBF) algorithms, while the average number of decoding iterations was reduced by 72.6% and 9.3%, respectively. The simulation results show that the improved algorithm outperforms the conventional algorithms while also reducing the average number of decoding iterations. This indicates that the new scheme achieves a better balance among error-correcting ability, decoding complexity and delay, and can be applied to high-speed communication systems with strict real-time requirements.
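As background for the bit-flipping function mentioned above, the sketch below shows plain single-bit gradient descent bit-flipping, the unweighted baseline rather than the proposed average-magnitude weighting, whose exact form the abstract does not give. Bits are mapped to bipolar symbols (0 to +1, 1 to -1). The small parity-check matrix is a (7,4) Hamming code chosen only to keep the example short; it is not an LDPC code from the paper.

```python
def gdbf_decode(H, y, max_iters=20):
    """Single-bit gradient descent bit-flipping on bipolar symbols.

    H: parity-check matrix as a list of 0/1 rows.
    y: received real-valued channel outputs.
    Returns the decoded bipolar word (+1 for bit 0, -1 for bit 1).
    """
    n = len(y)
    x = [1 if v >= 0 else -1 for v in y]  # initial hard decisions
    for _ in range(max_iters):
        # Bipolar syndrome: +1 if a check is satisfied, -1 otherwise.
        syndromes = []
        for row in H:
            s = 1
            for j in range(n):
                if row[j]:
                    s *= x[j]
            syndromes.append(s)
        if all(s == 1 for s in syndromes):
            break  # every parity check is satisfied
        # Inversion function: channel correlation plus the syndromes of
        # the checks each bit takes part in. The smallest value marks
        # the least reliable bit, which is the one flipped.
        delta = [
            x[k] * y[k] + sum(s for row, s in zip(H, syndromes) if row[k])
            for k in range(n)
        ]
        flip = min(range(n), key=lambda k: delta[k])
        x[flip] = -x[flip]
    return x


H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
# All-zero codeword sent (all +1); bit 2 is received weakly wrong.
y = [0.9, 1.1, -0.2, 1.0, 0.8, 1.2, 0.7]
decoded = gdbf_decode(H, y)
```

A weighted variant would multiply each syndrome term by a reliability value for its check before summing, which is where an average-magnitude weight of the kind described in the abstract would enter.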


Multiplicative watermarking algorithm based on wavelet visual model

Er-song HUANG, Jin-hua LIU, Ru-hong WEN

Journal of Computer Applications 2011, 31 (08): 2165-2168. DOI: 10.3724/SP.J.1087.2011.02165


Additive watermarking algorithms have good imperceptibility, but their watermarks have poor robustness. Therefore, a multiplicative image watermarking method was proposed that incorporates a visual model in the wavelet domain. In the proposed embedding scheme, the middle-frequency subband served as the watermark embedding space to achieve a tradeoff between the imperceptibility and the robustness of the watermarking system. In addition, the embedding strength factor was determined by considering the frequency masking, luminance masking and texture masking of the host image. In the proposed detection scheme, the probability density function of the wavelet coefficients was modeled by the Generalized Gaussian Distribution (GGD), the watermark decision threshold was obtained by using the Neyman-Pearson (NP) criterion, and the Receiver Operating Characteristic (ROC) curve relating the probability of false alarm to the probability of detection was derived. Finally, the robustness of the proposed watermarking method was tested against common image processing attacks such as JPEG compression, Additive White Gaussian Noise (AWGN), scaling and cropping. The experimental results demonstrate that the proposed method has good detection performance and good robustness.
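The multiplicative embedding rule can be sketched as follows. This is a generic illustration of multiplicative watermarking, assuming a bipolar watermark sequence and the common form c' = c(1 + a·w). The per-coefficient strengths here are hypothetical placeholders for the values that the paper derives from frequency, luminance and texture masking.

```python
def embed_multiplicative(coeffs, watermark, strengths):
    """Embed a bipolar watermark into wavelet coefficients.

    Each coefficient c becomes c * (1 + a * w), so the change scales
    with the coefficient magnitude. An additive scheme would instead
    use c + a * w, changing large and small coefficients equally.
    """
    return [
        c * (1.0 + a * w)
        for c, w, a in zip(coeffs, watermark, strengths)
    ]


# Hypothetical middle-frequency coefficients, +1/-1 watermark bits,
# and per-coefficient strength factors.
coeffs = [10.0, -20.0, 4.0]
watermark = [1, -1, 1]
strengths = [0.5, 0.25, 0.0]
marked = embed_multiplicative(coeffs, watermark, strengths)
```

A strength of zero leaves a coefficient untouched, so a visual model can switch embedding off wherever a change would be visible.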


Agent-based cooperative spectrum sensing algorithm

YE Qingsong, HUI Xiaowei

Journal of Computer Applications 2011, 31 (06): 1480-1482. DOI: 10.3724/SP.J.1087.2011.01480


To improve the spectrum sensing performance of cognitive radio technology, a new Agent-based cooperative spectrum sensing algorithm was proposed. The algorithm used multiple local energy detection thresholds in local detection, and the Signal-to-Noise Ratio (SNR) estimated by each cognitive user was sent to the main control center of the Agent. The control center then balanced the SNR against the distance between the transmitter and the cognitive nodes to select the cognitive nodes with high reliability and validity for decision fusion. The simulation results show that the algorithm improves the cooperative spectrum sensing capability of cognitive radio networks and, to some extent, reduces the number of nodes involved in sensing compared with the original cooperative sensing algorithm.